OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor

Internal identifier: 000E59 (Main/Exploration); previous: 000E58; next: 000E60

Authors: Guido Sautter [Germany]; Klemens Böhm [Germany]; Frank Padberg [Germany]; Walter Tichy [Germany]

RBID : ISTEX:79233416758986A23C3D805E917E2AB681EC3199

Abstract

Digitized scientific documents should be marked up according to domain-specific XML schemas, to make maximum use of their content. Such markup allows for advanced, semantics-based access to the document collection. Many NLP applications have been developed to support automated annotation. But NLP results often are not accurate enough; and manual corrections are indispensable. We therefore have developed the GoldenGATE editor, a tool that integrates NLP applications and assistance features for manual XML editing. Plain XML editors do not feature such a tight integration: Users have to create the markup manually or move the documents back and forth between the editor and (mostly command line) NLP tools. This paper features the first empirical evaluation of how users benefit from such a tight integration when creating semantically rich digital libraries. We have conducted experiments with humans who had to perform markup tasks on a document collection from a generic domain. The results show clearly that markup editing assistance in tight combination with NLP functionality significantly reduces the user effort in annotating documents.

DOI: 10.1007/978-3-540-74851-9_30


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor</title>
<author>
<name sortKey="Sautter, Guido" sort="Sautter, Guido" uniqKey="Sautter G" first="Guido" last="Sautter">Guido Sautter</name>
</author>
<author>
<name sortKey="Bohm, Klemens" sort="Bohm, Klemens" uniqKey="Bohm K" first="Klemens" last="Böhm">Klemens Böhm</name>
</author>
<author>
<name sortKey="Padberg, Frank" sort="Padberg, Frank" uniqKey="Padberg F" first="Frank" last="Padberg">Frank Padberg</name>
</author>
<author>
<name sortKey="Tichy, Walter" sort="Tichy, Walter" uniqKey="Tichy W" first="Walter" last="Tichy">Walter Tichy</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:79233416758986A23C3D805E917E2AB681EC3199</idno>
<date when="2007" year="2007">2007</date>
<idno type="doi">10.1007/978-3-540-74851-9_30</idno>
<idno type="url">https://api.istex.fr/document/79233416758986A23C3D805E917E2AB681EC3199/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000478</idno>
<idno type="wicri:Area/Istex/Curation">000471</idno>
<idno type="wicri:Area/Istex/Checkpoint">000874</idno>
<idno type="wicri:doubleKey">0302-9743:2007:Sautter G:empirical:evaluation:of</idno>
<idno type="wicri:Area/Main/Merge">000E72</idno>
<idno type="wicri:Area/Main/Curation">000E59</idno>
<idno type="wicri:Area/Main/Exploration">000E59</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor</title>
<author>
<name sortKey="Sautter, Guido" sort="Sautter, Guido" uniqKey="Sautter G" first="Guido" last="Sautter">Guido Sautter</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Computer Science, Universität Karlsruhe (TH), Am Fasanengarten 5, 76128 Karlsruhe</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Karlsruhe</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
<author>
<name sortKey="Bohm, Klemens" sort="Bohm, Klemens" uniqKey="Bohm K" first="Klemens" last="Böhm">Klemens Böhm</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Computer Science, Universität Karlsruhe (TH), Am Fasanengarten 5, 76128 Karlsruhe</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Karlsruhe</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
<author>
<name sortKey="Padberg, Frank" sort="Padberg, Frank" uniqKey="Padberg F" first="Frank" last="Padberg">Frank Padberg</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Computer Science, Universität Karlsruhe (TH), Am Fasanengarten 5, 76128 Karlsruhe</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Karlsruhe</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
<author>
<name sortKey="Tichy, Walter" sort="Tichy, Walter" uniqKey="Tichy W" first="Walter" last="Tichy">Walter Tichy</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Computer Science, Universität Karlsruhe (TH), Am Fasanengarten 5, 76128 Karlsruhe</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Karlsruhe</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2007</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">79233416758986A23C3D805E917E2AB681EC3199</idno>
<idno type="DOI">10.1007/978-3-540-74851-9_30</idno>
<idno type="ChapterID">30</idno>
<idno type="ChapterID">Chap30</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Digitized scientific documents should be marked up according to domain-specific XML schemas, to make maximum use of their content. Such markup allows for advanced, semantics-based access to the document collection. Many NLP applications have been developed to support automated annotation. But NLP results often are not accurate enough; and manual corrections are indispensable. We therefore have developed the GoldenGATE editor, a tool that integrates NLP applications and assistance features for manual XML editing. Plain XML editors do not feature such a tight integration: Users have to create the markup manually or move the documents back and forth between the editor and (mostly command line) NLP tools. This paper features the first empirical evaluation of how users benefit from such a tight integration when creating semantically rich digital libraries. We have conducted experiments with humans who had to perform markup tasks on a document collection from a generic domain. The results show clearly that markup editing assistance in tight combination with NLP functionality significantly reduces the user effort in annotating documents.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Allemagne</li>
</country>
<region>
<li>Bade-Wurtemberg</li>
<li>District de Karlsruhe</li>
</region>
<settlement>
<li>Karlsruhe</li>
</settlement>
</list>
<tree>
<country name="Allemagne">
<region name="Bade-Wurtemberg">
<name sortKey="Sautter, Guido" sort="Sautter, Guido" uniqKey="Sautter G" first="Guido" last="Sautter">Guido Sautter</name>
</region>
<name sortKey="Bohm, Klemens" sort="Bohm, Klemens" uniqKey="Bohm K" first="Klemens" last="Böhm">Klemens Böhm</name>
<name sortKey="Bohm, Klemens" sort="Bohm, Klemens" uniqKey="Bohm K" first="Klemens" last="Böhm">Klemens Böhm</name>
<name sortKey="Padberg, Frank" sort="Padberg, Frank" uniqKey="Padberg F" first="Frank" last="Padberg">Frank Padberg</name>
<name sortKey="Padberg, Frank" sort="Padberg, Frank" uniqKey="Padberg F" first="Frank" last="Padberg">Frank Padberg</name>
<name sortKey="Sautter, Guido" sort="Sautter, Guido" uniqKey="Sautter G" first="Guido" last="Sautter">Guido Sautter</name>
<name sortKey="Tichy, Walter" sort="Tichy, Walter" uniqKey="Tichy W" first="Walter" last="Tichy">Walter Tichy</name>
<name sortKey="Tichy, Walter" sort="Tichy, Walter" uniqKey="Tichy W" first="Walter" last="Tichy">Walter Tichy</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000E59 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000E59 | SxmlIndent | more
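
Once a record has been written out as text, for instance by an HfdSelect pipeline like the ones above, its fields can be read with any general-purpose XML parser. The following is a minimal sketch in Python, not part of Dilib, and it assumes the record has the shape shown on this page. One caveat: the record uses the wicri: prefix without declaring a namespace for it, so a namespace-aware parser would normally reject it. The sketch works around this by binding the prefix to a placeholder URI before parsing. The names WICRI_NS, parse_record, summarize and the shortened SAMPLE are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder URI (an assumption, not an official Wicri namespace):
# the record uses the wicri: prefix but never declares it.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text):
    """Parse a <record> after binding the undeclared wicri: prefix."""
    patched = xml_text.replace(
        "<record>", '<record xmlns:wicri="%s">' % WICRI_NS, 1
    )
    return ET.fromstring(patched)


def summarize(record):
    """Return the title, author names and DOI found in the TEI header."""
    title_stmt = record.find("./TEI/teiHeader/fileDesc/titleStmt")
    title = title_stmt.findtext("title")
    authors = [name.text for name in title_stmt.findall("author/name")]
    doi = record.findtext(".//publicationStmt/idno[@type='doi']")
    return {"title": title, "authors": authors, "doi": doi}


# Shortened copy of the record shown on this page (two authors only).
SAMPLE = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor</title>
<author><name sortKey="Sautter, Guido">Guido Sautter</name></author>
<author><name sortKey="Bohm, Klemens">Klemens Böhm</name></author>
</titleStmt>
<publicationStmt>
<idno type="doi">10.1007/978-3-540-74851-9_30</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

To process a real export, the text produced by the command line could be saved to a file and its contents passed to parse_record in place of SAMPLE.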

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:79233416758986A23C3D805E917E2AB681EC3199
   |texte=   Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024